{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "TPKWmjQyLo78"
   },
   "source": [
    "# Collider Bias"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "dEOrfqbbLrEB"
   },
   "source": [
    "Here is a simple mnemonic example to illustate the collider or M-bias.\n",
    "\n",
    "Here the idea is that people who get to Hollywood have to have high congenility = talent + beauty.  Funnily enough this induces a negative correlation between talents and looks, when we condition on the set of actors or celebrities.  This simple example explains an anecdotal observation that \"talent and beaty are negatively correlated\" for celebrities.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "rJjtboNOHnjJ"
   },
   "outputs": [],
   "source": [
    "!pip install pgmpy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "d2oBPqULlEZ3",
    "papermill": {
     "duration": 17.004212,
     "end_time": "2021-01-21T13:24:40.500575",
     "exception": false,
     "start_time": "2021-01-21T13:24:23.496363",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import statsmodels.formula.api as smf\n",
    "import networkx as nx\n",
    "from pgmpy.base.DAG import DAG\n",
    "from pgmpy.models.BayesianModel import BayesianNetwork\n",
    "from pgmpy.inference.CausalInference import CausalInference\n",
    "import pylab as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "LKuejs6CHmqH"
   },
   "outputs": [],
   "source": [
    "digraph = nx.DiGraph([('T', 'C'),\n",
    "                      ('B', 'C')])\n",
    "g = DAG(digraph)\n",
    "\n",
    "nx.draw_planar(g, with_labels=True)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19",
    "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5",
    "id": "5QSQID7xlEaB",
    "papermill": {
     "duration": 3.040016,
     "end_time": "2021-01-21T13:24:44.083514",
     "exception": false,
     "start_time": "2021-01-21T13:24:41.043498",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# collider bias\n",
    "np.random.seed(123)\n",
    "num_samples = 1000000\n",
    "talent = np.random.normal(size=num_samples)\n",
    "beauty = np.random.normal(size=num_samples)\n",
    "congeniality = talent + beauty + np.random.normal(size=num_samples)  # congeniality\n",
    "cond_talent = talent[congeniality > 0]\n",
    "cond_beauty = beauty[congeniality > 0]\n",
    "data = {\"talent\": talent, \"beauty\": beauty, \"congeniality\": congeniality,\n",
    "        \"cond_talent\": cond_talent, \"cond_beauty\": cond_beauty}\n",
    "\n",
    "print(smf.ols(\"talent ~ beauty\", data).fit().summary())\n",
    "print(smf.ols(\"talent ~ beauty + congeniality\", data).fit().summary())\n",
    "print(smf.ols(\"cond_talent ~ cond_beauty\", data).fit().summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "LrEuHFPIlEaD",
    "papermill": {
     "duration": 0.008327,
     "end_time": "2021-01-21T13:24:44.101113",
     "exception": false,
     "start_time": "2021-01-21T13:24:44.092786",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "We can also use package pgmpy to illustrate collider bias, also known as M-bias."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "idUmDEQKIYGH"
   },
   "outputs": [],
   "source": [
    "inference = CausalInference(BayesianNetwork(g))\n",
    "inference.get_all_backdoor_adjustment_sets('T', 'B')\n",
    "# empty set -- we should not condition on the additional variable C."
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": [
    {
     "file_id": "1N50trfAsrV0YFY3ArfrFMnRP50jQg_6S",
     "timestamp": 1704650991448
    }
   ]
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  },
  "papermill": {
   "default_parameters": {},
   "duration": 24.263285,
   "end_time": "2021-01-21T13:24:44.323080",
   "environment_variables": {},
   "exception": null,
   "input_path": "__notebook__.ipynb",
   "output_path": "__notebook__.ipynb",
   "parameters": {},
   "start_time": "2021-01-21T13:24:20.059795",
   "version": "2.2.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
